SECTION 1. GENERAL INFORMATION


1.1  BACKGROUND 

Technologies  under  development  for  the  detection  and  discrimination  of  unexploded 
ordnance (UXO) require testing so that their performance can be characterized. To that end,
Standardized  Test  Sites  have  been  developed  at  Aberdeen  Proving  Ground  (APG),  Maryland  and 
U.S.  Army  Yuma  Proving  Ground  (YPG),  Arizona.  These  test  sites  provide  a  diversity  of 
geology,  climate,  terrain,  and  weather  as  well  as  diversity  in  ordnance  and  clutter.  Testing  at 
these  sites  is  independently  administered  and  analyzed  by  the  government  for  the  purposes  of 
characterizing  technologies,  tracking  performance  with  system  development,  comparing 
performance  of  different  systems,  and  comparing  performance  in  different  environments. 

The  Standardized  UXO  Technology  Demonstration  Site  Program  is  a  multi-agency 
program  spearheaded  by  the  U.S.  Army  Environmental  Center  (AEC).  The  U.S.  Army  Aberdeen 
Test  Center  (ATC)  and  the  U.S.  Army  Corps  of  Engineers  Engineering  Research  and  Development 
Center  (ERDC)  provide  programmatic  support.  The  program  is  being  funded  and  supported  by 
the  Environmental  Security  Technology  Certification  Program  (ESTCP),  the  Strategic 
Environmental  Research  and  Development  Program  (SERDP)  and  the  Army  Environmental 
Quality  Technology  Program  (EQT). 

1.2  SCORING  OBJECTIVES 

The  objective  in  the  Standardized  UXO  Technology  Demonstration  Site  Program  is  to 
evaluate  the  detection  and  discrimination  capabilities  of  a  given  technology  under  various  field 
and  soil  conditions.  Inert  munitions  and  clutter  items  are  positioned  in  various  orientations  and 
depths  in  the  ground. 

The  evaluation  objectives  are  as  follows: 

a.  To  determine  detection  and  discrimination  effectiveness  under  realistic  scenarios  that 
vary  targets,  geology,  clutter,  topography,  and  vegetation. 

b.  To  determine  cost,  time,  and  manpower  requirements  to  operate  the  technology. 

c. To determine the demonstrator’s ability to analyze survey data in a timely manner and
provide  prioritized  “Target  Lists”  with  associated  confidence  levels. 

d.  To  provide  independent  site  management  to  enable  the  collection  of  high  quality, 
ground-truth,  geo-referenced  data  for  post-demonstration  analysis. 

1.2.1  Scoring  Methodology 

a.  The  scoring  of  the  demonstrator’s  performance  is  conducted  in  two  stages.  These  two 
stages  are  termed  the  RESPONSE  STAGE  and  DISCRIMINATION  STAGE.  For  both  stages, 
the probability of detection (Pd) and the false alarms are reported as receiver-operating
characteristic (ROC) curves. False alarms are divided into those anomalies that correspond to
emplaced  clutter  items,  measuring  the  probability  of  false  positive  (Pfp),  and  those  that  do  not 
correspond  to  any  known  item,  termed  background  alarms. 

b.  The  RESPONSE  STAGE  scoring  evaluates  the  ability  of  the  system  to  detect  emplaced 
targets  without  regard  to  ability  to  discriminate  ordnance  from  other  anomalies.  For  the  blind 
grid  RESPONSE  STAGE,  the  demonstrator  provides  the  scoring  committee  with  a  target 
response  from  each  and  every  grid  square  along  with  a  noise  level  below  which  target  responses 
are  deemed  insufficient  to  warrant  further  investigation.  This  list  is  generated  with  minimal 
processing  and,  since  a  value  is  provided  for  every  grid  square,  will  include  signals  both  above 
and  below  the  system  noise  level. 

c.  The  DISCRIMINATION  STAGE  evaluates  the  demonstrator’s  ability  to  correctly 
identify  ordnance  as  such  and  to  reject  clutter.  For  the  blind  grid  DISCRIMINATION  STAGE, 
the  demonstrator  provides  the  scoring  committee  with  the  output  of  the  algorithms  applied  in  the 
discrimination-stage  processing  for  each  grid  square.  The  values  in  this  list  are  prioritized  based 
on  the  demonstrator’s  determination  that  a  grid  square  is  likely  to  contain  ordnance.  Thus, 
higher  output  values  are  indicative  of  higher  confidence  that  an  ordnance  item  is  present  at  the 
specified  location.  For  digital  signal  processing,  priority  ranking  is  based  on  algorithm  output. 
For  other  discrimination  approaches,  priority  ranking  is  based  on  human  (subjective)  judgment. 
The demonstrator also specifies the threshold in the prioritized ranking that provides optimum
performance (i.e., the threshold that is expected to retain all detected ordnance and reject the
maximum amount of clutter).

d. The demonstrator is also scored on EFFICIENCY and REJECTION RATIO, which
measure the effectiveness of the discrimination stage processing. The goal of discrimination is
to  retain  the  greatest  number  of  ordnance  detections  from  the  anomaly  list,  while  rejecting  the 
maximum  number  of  anomalies  arising  from  non-ordnance  items.  EFFICIENCY  measures  the 
fraction  of  detected  ordnance  retained  after  discrimination,  while  the  REJECTION  RATIO 
measures  the  fraction  of  false  alarms  rejected.  Both  measures  are  defined  relative  to 
performance  at  the  demonstrator-supplied  level  below  which  all  responses  are  considered  noise, 
i.e.,  the  maximum  ordnance  detectable  by  the  sensor  and  its  accompanying  false  positive  rate  or 
background  alarm  rate. 

e.  Based  on  configuration  of  the  ground  truth  at  the  standardized  sites  and  the  defined 
scoring  methodology,  there  exists  the  possibility  of  having  anomalies  within  overlapping  halos 
and/or  multiple  anomalies  within  halos.  In  these  cases,  the  following  scoring  logic  is 
implemented: 

(1) In situations where multiple anomalies exist within a single R_halo, the anomaly with
the  strongest  response  or  highest  ranking  will  be  assigned  to  that  particular  ground  truth  item. 

(2) For overlapping R_halo situations, ordnance has precedence over clutter. The anomaly
with  the  strongest  response  or  highest  ranking  that  is  closest  to  the  center  of  a  particular  ground 
truth  item  gets  assigned  to  that  item.  Remaining  anomalies  are  retained  until  all  matching  is 
complete. 




(3) Anomalies located within any R_halo that do not get associated with a particular ground
truth  item  are  thrown  out  and  are  not  considered  in  the  analysis. 

f.  All  scoring  factors  are  generated  utilizing  the  Standardized  UXO  Probability  and  Plot 
Program,  version  3.1.1. 
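The matching logic in item e can be sketched in code. The following is only an illustrative reading of rules (1) through (3), not the Standardized UXO Probability and Plot Program itself. The class names, the tie-breaking by distance, and the treatment of anomalies outside every halo as background alarms are assumptions made for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class GroundTruthItem:
    name: str
    x: float
    y: float
    is_ordnance: bool
    halo_radius: float  # R_halo in meters; any value used here is hypothetical


@dataclass
class Anomaly:
    x: float
    y: float
    response: float  # response-stage signal or discrimination-stage ranking


def match_anomalies(items, anomalies):
    """Return (assignments, discarded, background).

    assignments: dict mapping item name -> index of the assigned anomaly
    discarded:   indices of anomalies inside some halo but left unassigned
    background:  indices of anomalies outside every halo (assumed here to be
                 the candidates for background alarms)
    """
    def dist(item, anomaly):
        return math.hypot(item.x - anomaly.x, item.y - anomaly.y)

    unassigned = set(range(len(anomalies)))
    assignments = {}
    # Rule (2): ordnance items are matched before clutter items.
    for item in sorted(items, key=lambda it: not it.is_ordnance):
        candidates = [i for i in unassigned
                      if dist(item, anomalies[i]) <= item.halo_radius]
        if not candidates:
            continue
        # Rules (1) and (2): strongest response wins; as an assumption, ties
        # go to the anomaly closest to the center of the item.
        best = max(candidates,
                   key=lambda i: (anomalies[i].response,
                                  -dist(item, anomalies[i])))
        assignments[item.name] = best
        unassigned.discard(best)

    # Rule (3): leftovers inside any halo are thrown out of the analysis.
    discarded, background = [], []
    for i in sorted(unassigned):
        in_halo = any(dist(it, anomalies[i]) <= it.halo_radius for it in items)
        (discarded if in_halo else background).append(i)
    return assignments, discarded, background
```

Processing ordnance first is one simple way to give ordnance precedence in overlapping halos; a different ordering within each class could change which anomaly a given item receives.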

1.2.2  Scoring  Factors 

Factors  to  be  measured  and  evaluated  as  part  of  this  demonstration  include: 

a.  Response  Stage  ROC  curves: 

(1) Probability of Detection (Pd^res).

(2) Probability of False Positive (Pfp^res).

(3) Background Alarm Rate (BAR^res) or Probability of Background Alarm (Pba^res).

b. Discrimination Stage ROC curves:

(1) Probability of Detection (Pd^disc).

(2) Probability of False Positive (Pfp^disc).

(3) Background Alarm Rate (BAR^disc) or Probability of Background Alarm (Pba^disc).

c. Metrics:

(1) Efficiency (E).

(2) False Positive Rejection Rate (Rfp).

(3) Background Alarm Rejection Rate (Rba).

d.  Other: 

(1)  Probability  of  Detection  by  Size  and  Depth. 

(2) Classification by type (e.g., 20-, 40-, 105-mm).

(3)  Location  accuracy. 

(4)  Equipment  setup,  calibration  time  and  corresponding  man-hour  requirements. 

(5)  Survey  time  and  corresponding  man-hour  requirements. 

(6) Reacquisition/resurvey time and man-hour requirements (if any).

(7)  Downtime  due  to  system  malfunctions  and  maintenance  requirements. 
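The metrics under item c follow directly from the definitions in paragraph 1.2.1d: efficiency is the fraction of detected ordnance retained after discrimination, and each rejection rate is the fraction of the corresponding alarms rejected, both relative to performance at the demonstrator-supplied noise level. A minimal sketch is given below; the function names are hypothetical and the counts in the usage note are invented purely for illustration.

```python
def efficiency(detected_at_noise_level, retained_after_discrimination):
    """Fraction of response-stage ordnance detections kept by discrimination."""
    if detected_at_noise_level <= 0:
        raise ValueError("no ordnance detected at the noise level")
    return retained_after_discrimination / detected_at_noise_level


def rejection_rate(alarms_at_noise_level, alarms_after_discrimination):
    """Fraction of false alarms (clutter or background) removed by discrimination."""
    if alarms_at_noise_level <= 0:
        raise ValueError("no alarms at the noise level")
    return 1.0 - alarms_after_discrimination / alarms_at_noise_level
```

For example, with hypothetical counts of 80 ordnance detections at the noise level and 72 retained at the discrimination threshold, `efficiency(80, 72)` is 0.9. The same `rejection_rate` function serves for both Rfp (clutter-related false positives) and Rba (background alarms), depending on which counts are supplied.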

1.3  STANDARD  AND  NONSTANDARD  INERT  ORDNANCE  TARGETS 

The  standard  and  nonstandard  ordnance  items  emplaced  in  the  test  areas  are  listed  in 
Table  1.  Standardized  targets  are  members  of  a  set  of  specific  ordnance  items  that  have  identical 
properties  to  all  other  items  in  the  set  (caliber,  configuration,  size,  weight,  aspect  ratio,  material, 
filler,  magnetic  remanence,  and  nomenclature).  Nonstandard  targets  are  inert  ordnance  items 
having  properties  that  differ  from  those  in  the  set  of  standardized  targets. 


TABLE  1.  INERT  ORDNANCE  TARGETS 


Standard Type                    Nonstandard (NS)
20-mm Projectile M55             20-mm Projectile M55
                                 20-mm Projectile M97
40-mm Grenades M385              40-mm Grenades M385
40-mm Projectile MKII Bodies     40-mm Projectile M813
BDU-28 Submunition
BLU-26 Submunition
M42 Submunition
57-mm Projectile APC M86
60-mm Mortar M49A3               60-mm Mortar (JPG)
                                 60-mm Mortar M49
2.75-inch Rocket M230            2.75-inch Rocket M230
                                 2.75-inch Rocket XM229
MK 118 ROCKEYE
81-mm Mortar M374                81-mm Mortar (JPG)
                                 81-mm Mortar M374
105-mm HEAT Rounds M456
105-mm Projectile M60            105-mm Projectile M60
155-mm Projectile M483A1         155-mm Projectile M483A
500-lb Bomb

JPG  =  Jefferson  Proving  Ground 
HEAT  =  high-explosive,  antitank 


SECTION 2. DEMONSTRATION


2.1  DEMONSTRATOR  INFORMATION 

2.1.1  Demonstrator  Point  of  Contact  (POC)  and  Address 

POC:  Mr.  Rob  Siegel 

617-618-4662 

Address:  GEO-CENTERS,  INC. 

7  Wells  Avenue 
Newton,  MA  02459 

2.1.2  System  Description  (provided  by  demonstrator) 

The Simultaneous Multi-sensor Surface Towed Ordnance Location System (STOLS) is
a Global Positioning System (GPS)-integrated vehicular towed array with the unique capability
to simultaneously co-deploy total field magnetometers and EM61 electromagnetic (EM) sensors
on  a  common  platform.  This  approach  combines  the  two  sensors  that  have  been  demonstrated  by 
multiple  tests  at  JPG  in  the  1990s  to  be  the  most  effective  against  UXO,  and  results 
in,  effectively,  two  surveys  for  the  price  of  one.  This  significantly  improves  site  characterization 
and  potential  detection  capability  while  reducing  cost.  The  system  was  developed  by 
GEO-CENTERS and the Corps of Engineers-Huntsville Center (CEHNC) under Environmental
Security Technology Certification Program (ESTCP) project UX-0208, the goal of which was to
integrate EM61s into GEO-CENTERS’ existing STOLS towed magnetometer array. Normally,
commercial  off-the-shelf  (COTS)  EM61s  and  magnetometers  cannot  be  co-deployed  due  to  the 
noise  engendered  in  the  magnetometer  data  by  the  EM61’s  transmit  pulses,  but  under  the 
ESTCP-funded  project,  custom  electronics  were  developed  that  interleave  the  two  data  streams, 
effectively sampling the magnetometers only during the period when the EM61s are quiet. Also
funded was the development of a fiberglass proof-of-concept platform to host both the
magnetometers and the EM61 coils in a very low-noise environment. Major portions of
GEO-CENTERS’ original STOLS magnetometer-only towed array were utilized; the existing
aluminum-framed, low-magnetic-self-signature tow vehicle, five cesium vapor total field
magnetometers, three channels of EM61 MK1 (single time gate) electronics, three 1/2 by
1/2 meter coils, a Trimble real-time kinematic (RTK)-equipped GPS capable of centimeter-level
accuracy in real time, and the data acquisition and data processing infrastructure were
leveraged by the ESTCP-funded effort
(fig.  1).  The  system  also  uses  a  stationary  reference  magnetometer  to  track  the  diurnal  variations 
of  the  Earth’s  ambient  magnetic  field.  These  are  later  subtracted  from  the  vehicle  data  during 
processing. 

The  ESTCP-funded  system  has  been  significantly  improved  through  an  ongoing 
Cooperative  Research  and  Development  Agreement  (CRADA)  between  CEHNC  and 
GEO-CENTERS.  These  improvements  include  updating  the  EM61  system  to  include  five  1  by 
1/2  meter  coils  (making  the  EM  swath  the  same  as  the  magnetometer  swath  width)  driven  by 
MKII  multiple  time  gate  electronics,  the  addition  of  a  suspension  to  the  original  proof-of-concept 
fiberglass towed platform, a ruggedized computer for data acquisition, and powering all EM61
electronics  off  a  common  isolated  battery  to  eliminate  drift  and  mitigate  noise.  The  purchase  of 
the  new  EM61  hardware  was  funded  by  ATC  through  the  Army  EQT  program. 


Figure  1.  Demonstrator’s  system,  STOLS/towed  array. 


Spacing  and  Sampling  Rate:  The  magnetometers  and  EM61  coils  are  each  at  1/2  meter 
spacing  cross-track,  with  the  five  EM61  coils  along  the  center  line  of  the  five  magnetometers. 
The  GPS  antenna  is  directly  over  the  center  magnetometer.  The  down-track  separation  between 
the  magnetometer  array  and  the  EM61  array  is  currently  8  feet,  though  this  is  an  overly 
conservative  artifact  of  the  original  ESTCP-funded  design.  Since  the  synchronized  electronics 
sample  the  magnetometers  during  the  period  when  the  EM61  transmit  pulse  is  quiet,  the 
magnetometer  sampling  rate  is  the  same  as  the  EM61  transmit  pulse  rate  -  namely,  75  Hz.  Like 
all COTS EM61s, the electronics average the data until they receive a signal from a tick wheel.
Here, an electrical circuit subdivides the GPS 1 pulse-per-second (PPS) signal into a 10 Hz tick
signal that triggers the EM61 to output data. Thus, the EM61 data output rate is 10 Hz.
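The practical effect of the two rates is easiest to see as along-track sample spacing, which is simply ground speed divided by sample rate. The short sketch below works this out; the function name is hypothetical, and the speeds used with it are illustrative assumptions rather than values recorded during the demonstration.

```python
MPH_TO_MPS = 0.44704  # one mile per hour in meters per second (exact)

MAG_RATE_HZ = 75.0  # magnetometer sampling rate stated above
EM_RATE_HZ = 10.0   # EM61 data output rate stated above


def along_track_spacing_m(speed_mps, rate_hz):
    """Distance traveled between consecutive samples, in meters."""
    if rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return speed_mps / rate_hz


if __name__ == "__main__":
    # Illustrative tow speeds only; actual survey speeds are not given here.
    for mph in (2.0, 5.0):
        v = mph * MPH_TO_MPS
        print(f"{mph:4.1f} mph: mag {along_track_spacing_m(v, MAG_RATE_HZ):.3f} m, "
              f"EM {along_track_spacing_m(v, EM_RATE_HZ):.3f} m")
```

At any given speed the EM61 samples are 7.5 times farther apart along track than the magnetometer samples, since 75 Hz / 10 Hz = 7.5.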

2.1.3  Data  Processing  Description  (provided  by  demonstrator) 

Processing starts from the multi-sensor vehicular survey data and the diurnal variation data.
GPS data are read and converted into Universal Transverse Mercator (UTM) coordinates to
determine site physical extent. Sensor and position data are then processed and interpolated. The software then sets up a
site  (a  grid  in  memory)  which  wholly  contains  the  surveyed  data.  Then  the  position  data  are 
examined  and  corrected  as  needed.  Automatic  correction  examines  the  position  data  for  jumps 
greater  than  expected  for  typical  survey  speeds  up  to  12  miles  per  hour.  The  heading  between 
updates is determined and the positions of the 75 Hz magnetometer and 10 Hz EM samples are
calculated. If large jumps in the position data are encountered (e.g., jumps caused by short-term
differential dropouts), the operator is asked to examine the data and manually correct a bad point
by forcing it to align with the normal survey line. The corrected navigation data are then saved
with the sensor data in a new file.
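The automatic position check described above can be illustrated with a simple speed test between consecutive GPS fixes. This is a sketch under stated assumptions, not the demonstrator's software: the function name and the tuple layout are hypothetical, the first fix is assumed to be good, and a flagged fix is merely reported rather than corrected.

```python
import math

MPH_TO_MPS = 0.44704  # one mile per hour in meters per second (exact)


def flag_position_jumps(fixes, max_speed_mph=12.0):
    """Return indices of fixes that imply an implausible speed.

    fixes: list of (time_s, easting_m, northing_m) tuples sorted by time.
    Each fix is compared with the most recent fix that passed the check, so a
    single bad point does not cause the following good point to be flagged.
    """
    limit_mps = max_speed_mph * MPH_TO_MPS
    flagged = []
    last_good = fixes[0] if fixes else None  # assumption: first fix is good
    for i in range(1, len(fixes)):
        t, e, n = fixes[i]
        t0, e0, n0 = last_good
        dt = t - t0
        speed = math.hypot(e - e0, n - n0) / dt if dt > 0 else math.inf
        if speed > limit_mps:
            flagged.append(i)
        else:
            last_good = fixes[i]
    return flagged
```

Flagged indices would then be handed to the operator for the manual correction step described in the text.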

The  magnetometer  portion  of  the  new  navigation-corrected  file  is  then  processed  with  the 
temporally  registered  diurnal  variation  data.  The  diurnal  data  are  subtracted  from  the  survey 
magnetometer  data  to  eliminate  the  effects  of  changes  to  the  Earth’s  magnetic  field  during  the 
course of the survey and to normalize the data around zero gamma. The diurnally corrected data
are  then  interpolated  into  a  10  cm  grid  for  image  display.  A  linear  interpolation  is  used,  with 
an  interpolation  window  of  +/-  30  cm.  This  interpolation  window  functions  in  both 
directions  -  interpolation  is  performed  cross-track  (between  the  sensors  spaced  1/2  meter  apart) 
as  well  as  along  the  direction  of  travel  (between  the  75  Hz  magnetometer  or  10  Hz  EM  updates). 
The  final  interpolated  image  is  displayed  and  written  as  a  separate  file.  Additional  processing 
steps  are  sometimes  used  to  create  the  best  possible  interpolated  image.  This  sometimes 
involves  removing  small  inter-magnetometer  biases  from  the  data  to  correct  for  minor  sensor-to- 
sensor  differences,  removing  small  directional  offsets  from  the  data,  and  running  a  median  filter 
on the time-series sensor updates to remove spurious data values. The interpolated images will
be examined and a judgment will be made as to whether any of these or any other additional
techniques are required. The EM portion of the data file will be processed in a similar fashion
except  that  no  diurnal  variation  data  will  be  subtracted. 
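Two of the steps in this paragraph, the diurnal subtraction and the median filter, are simple enough to sketch. The code below is an illustration only: the function names are hypothetical, linear interpolation of the reference station record to each survey sample time is an assumption about how the two records are temporally registered, and the three-sample filter window is an arbitrary choice.

```python
from bisect import bisect_right
from statistics import median


def interp_linear(t, times, values):
    """Linearly interpolate values(times) at time t, clamping at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    j = bisect_right(times, t)
    t0, t1 = times[j - 1], times[j]
    w = (t - t0) / (t1 - t0)
    return values[j - 1] + w * (values[j] - values[j - 1])


def diurnal_correct(survey_t, survey_gamma, ref_t, ref_gamma):
    """Subtract the reference magnetometer reading from each survey sample."""
    return [b - interp_linear(t, ref_t, ref_gamma)
            for t, b in zip(survey_t, survey_gamma)]


def median_filter(samples, window=3):
    """Running median over a centered window, shortened at the ends."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        out.append(median(samples[lo:hi]))
    return out
```

Because the reference reading is close to the ambient field, the corrected values cluster around zero gamma, which is the normalization the text describes. The EM data would skip `diurnal_correct` but could still pass through the same filter.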

For  this  YPG  exercise,  processed  data  will  be  given  to  Dr.  Steven  Billings  and  Dr.  Leonard 
Pasion,  both  of  the  University  of  British  Columbia  and  Sky  Research,  Inc.  Dr.  Billings  and  Dr. 
Pasion  will  process  both  the  magnetometer  and  EM61  data  via  inverse  modeling  techniques. 
Existing algorithms have been developed to use the degree of remanent magnetization as a
discriminator  of  UXO  from  clutter,  though  the  direct  applicability  of  this  technique  to  the  APG 
site,  where  ordnance  has  been  seeded  and  thus  has  not  lost  its  moment  due  to  shock 
demagnetization,  is  unknown.  The  beta  technique,  where  the  EM61  data  is  inverted  and 
parameters  related  to  object  symmetry  are  used  as  UXO/clutter  discriminators,  will  also  be 
employed.  In  addition,  Billings  and  Pasion  will  attempt  to  perform  a  cooperative  inversion  of 
both  data  sets.  Plans  are  also  to  employ  a  statistical  classifier  for  the  discrimination. 

2.1.4  Data  Submission  Format 


Data  were  submitted  for  scoring  in  accordance  with  data  submission  protocols  outlined  in 
the  Standardized  UXO  Technology  Demonstration  Site  Handbook.  These  submitted  data  are  not 
included  in  this  report  in  order  to  protect  ground  truth  information. 

2.1.5 Demonstrator Quality Assurance (QA) and Quality Control (QC) (provided by demonstrator)

An  automated  data  quality  program  examines  the  data  and  reports  out-of-range 
magnetometer  readings  and  bad  (nondifferential)  position  readings.  This  gives  a  quick  and 
convenient  benchmark  on  out-of-range  data  that  may  be  indicative  of  navigation  or  sensor  errors. 
Typically  this  report  is  small  enough  to  be  entered  manually  into  the  site  data  processing  and 
archive  log. 

Multi-sensor vehicular STOLS is a self-contained geophysical survey system that hosts up
to five magnetometers and five 1 by 1/2 meter EM61 coils, an RTK differential GPS, an
embedded  computer/data  logger,  and  operator  input/output  devices. 

As deployed, STOLS performs continuous QC checks with immediate operator feedback on
system status. In addition to this self-monitoring feature, STOLS is set up with a comprehensive
set of checklists for the Base Navigation Station, the Diurnal (magnetometer) Reference Station,
the STOLS Field Technician, and Data Management. These checklists are filed daily and are
available for review. Among the items that the checklists cover are:

• Base GPS reference position and pseudorange correction values. If the reference
position does not match the checklist, it is adjusted and verified. If the pseudorange
correction values are excessive (any one correction value greater than 100 meters), the
Base GPS reference position is checked again. This process ensures that the Base GPS
is performing within its performance envelope.

•  Diurnal  variation  (reference  magnetometer)  station  time  synchronization  with  GPS  time 
is  verified,  tuning  value  is  checked,  and  initial  battery  and  field  strength  values 
recorded. 

• Multi-sensor STOLS is set up with a comprehensive field technician checklist. Data
values  are  displayed  on  the  screen  during  data  acquisition. 

• Because STOLS uses GPS to position the sensor survey data, daily survey
plans will be guided by the use of commercially available satellite planning software
(Trimble’s QuickPlan). This program allows the survey work to be scheduled during
hours of peak GPS coverage, hence optimum positioning performance. Predicted
positioning performance is determined by a GPS positioning accuracy parameter called
Position Dilution of Precision (PDOP). PDOP values are predicted based on the
general  site  location  (WGS84  LAT/LON),  time  of  day,  number  of  available  satellites  in 
view,  and  satellite  geometry. 

•  Seeded  with  the  site  location,  a  current  GPS  ephemeris  file  (current  satellite 
constellation  map  available  on-line  or  from  the  GPS  receiver),  minimum  satellite 
elevation,  and  current  date,  QuickPlan  displays  the  number  of  satellites  in  view  and  the 
corresponding PDOP for every moment of the day. A PDOP value of 7.0 is
used as the upper limit for acceptable positioning accuracy (lower PDOP values indicate
higher positioning accuracy).

Note:  The  GPS  rover  receiver  in  the  tow  vehicle  is  programmed  with  a  PDOP  mask  of  7.0.  If 
this  value  is  exceeded,  the  receiver  fix  quality  drops  to  zero.  This  provides  an  automatic  halt  to 
data  acquisition  (after  15  seconds)  and  a  warning  alert  message  to  the  tow  vehicle  operator  to 
wait  for  better  positioning  accuracy. 

•  An  additional  QuickPlan  display  shows  satellite  trajectories  throughout  the  planned  day 
to  further  assist  in  site  investigation  planning  (e.g.  if  a  high  number  of  satellites  lie  to 
the  west  at  low  elevation  during  a  certain  part  of  the  work  day,  they  may  be  blocked  by 
the local buildings). All of this information is used to effectively plan the investigation
workday.  Workday  times  with  unacceptable  PDOP  values  are  used  for  lunch  breaks  or 
other  investigation  tasks,  including  data  transfer,  processing,  analysis,  or  logistics 
resupply. 
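The planning step in the bullets above amounts to finding the parts of the day when predicted PDOP stays at or below the 7.0 mask. The sketch below shows that selection on a plain list of predicted values. It does not call QuickPlan or any Trimble interface; the function name, the hourly sampling, and the PDOP values in the usage note are all hypothetical.

```python
def acceptable_windows(times, pdop, pdop_mask=7.0):
    """Return (start, end) pairs for contiguous runs where PDOP <= pdop_mask.

    times and pdop are parallel sequences; times may be hours of the day or
    any other monotonically increasing label.
    """
    windows = []
    start = None
    previous = None
    for t, p in zip(times, pdop):
        if p <= pdop_mask:
            if start is None:
                start = t
            previous = t
        elif start is not None:
            windows.append((start, previous))
            start = None
    if start is not None:
        windows.append((start, previous))
    return windows
```

With invented hourly predictions of 3.0, 4.5, 8.2, 9.0, 6.9, 7.0, and 7.4 for hours 7 through 13, the function returns the windows (7, 8) and (11, 12). The gap between them is the kind of period the text assigns to lunch breaks, data transfer, or resupply.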

A high degree of QC is attained by having trained personnel, who know which data values
are acceptable and which are not, operate the system. All values are displayed once per
second  for  operator  observation.  Total  field  cesium  vapor  magnetometers  are  used  for  their  high 
sensitivity  (0.01  gammas)  and  high  dynamic  range  (20,000  to  95,000  gamma).  Magnetic  field 
strengths  outside  this  dynamic  range  result  in  a  0  output  that  is  monitored  by  the  data  acquisition 
software.  These  sensors  also  have  active  and  dead  zones  that  interact  with  the  local  field 
direction.  Both  the  sensor  alignment/misalignment  and  sensor  out  of  range  are  constantly 
monitored  by  the  data  acquisition  software,  and  the  operator  is  alerted  to  these  error  conditions. 
As  delivered  from  the  sensor  manufacturer,  these  sensors  either  work  or  they  don’t  work.  Other 
than  replacing  failed  sensors/cables,  there  are  no  operator  calibration  adjustments  that  can  be 
made  to  the  magnetometer  array.  There  may  be  sensor-to-sensor  offsets  that  are  fixed  or 
directionally  sensitive  which  can  be  adjusted  for  at  the  data  processing  end,  if  required.  The 
EM61  data  for  all  lower  and  upper  coils  is  displayed  in  real  time.  The  operator  may  adjust  the 
zero  setting  for  each  coil  pair  at  the  EM  electronics  or  via  a  software  background  subtraction. 
The  operator  is  trained  to  observe  the  EM  output  for  baseline  readings,  acceptable  noise  levels, 
drift,  and  sensor  failure.  The  rover  differential  GPS  requires  radio  line  of  sight  to  the  base 
navigation  station  and  access  to  the  local  GPS  satellite  constellation.  The  data  acquisition 
program  monitors  and  assesses  the  navigation  data  quality  for  both  of  these  conditions 
continuously  and  alerts  the  operator  whenever  there  is  a  problem. 
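The magnetometer range check described in this paragraph can be sketched as a small classifier. The limits and the zero-output behavior are taken from the text; the function names and status labels are hypothetical, and the actual acquisition software may report these conditions differently.

```python
from collections import Counter

MAG_MIN_GAMMA = 20_000.0  # lower end of the stated dynamic range
MAG_MAX_GAMMA = 95_000.0  # upper end of the stated dynamic range


def classify_reading(value_gamma):
    """Label one total-field reading as 'ok', 'dropout', or 'out_of_range'."""
    if value_gamma == 0:
        # The sensor outputs 0 when the field is outside its dynamic range.
        return "dropout"
    if MAG_MIN_GAMMA <= value_gamma <= MAG_MAX_GAMMA:
        return "ok"
    return "out_of_range"


def summarize_readings(readings):
    """Count readings in each status, e.g. for a QC log entry."""
    return Counter(classify_reading(r) for r in readings)
```

A summary with any count other than "ok" would correspond to the error conditions on which the operator is alerted.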

After  a  survey  is  complete  and  the  data  transferred,  a  separate  program  examines  and 
reports  on  the  navigation  and  sensor  quality.  The  results  of  this  report  are  typically  manually 
entered  onto  the  data  processing  and  archiving  log  sheet. 

The  data  processing  end  of  STOLS  is  the  largest  measure  of  QC  and  assessment.  At  the 
workstation,  raw  data  is  archived,  the  navigation  data  is  corrected  for  any  jumps,  and  the 
0.5  meter  by  75  or  10  Hz  sensor  data  is  interpolated  to  a  10  cm  grid  for  display.  The  visual 
quality  of  this  image  is  the  best  indicator  of  system  quality  and  can  be  scaled  to  optimally  display 
individual  magnetic  or  EM  anomalies.  Once  this  image  is  made,  site-specific  landmarks  from 
each  survey  may  be  overlaid. 

Target  coordinates  should  overlay  an  anomaly  in  the  image  for  visual  correlation.  This 
may  also  be  done  for  the  base  navigation  station  location(s).  Additionally,  anomalies  can  be 
analyzed  and  their  coordinates  determined  and  compared  with  ground  truth.  Both  techniques 
may  be  used. 

Multi-sensor  STOLS  will  be  field-tested  daily  to  ensure  it  is  operating  properly.  If  the 
standard  response  cannot  be  attained,  the  system  will  be  repaired,  or  components  replaced. 

Failed  or  failing  equipment  will  be  replaced.  Problems  associated  with  low  battery  voltage 
(e.g.  sensor  drift)  will  require  battery  charging  and  possible  resurveying. 

QA procedures mandated by the Corps of Engineers will also be employed. These will
include  a  daily  static  check  and  daily  object  spike  test. 

2.1.6  Additional  Records 


The  following  record(s)  by  this  vendor  can  be  accessed  via  the  Internet  as  Microsoft  Word 
documents  at  www.uxotestsites.org.  The  Blind  Grid  counterpart  to  this  report  is  Scoring  Record 
No.  293. 

2.2  YPG  SITE  INFORMATION 

2.2.1  Location 

YPG  is  located  adjacent  to  the  Colorado  River  in  the  Sonoran  Desert.  The  UXO  Standardized 
Test  Site  is  located  south  of  Pole  Line  Road  and  east  of  the  Countermine  Testing  and  Training 
Range.  The  Open  Field  range,  Calibration  Grid,  Blind  Grid,  Mogul  area,  and  Desert  Extreme 
area  comprise  the  350  by  500-meter  general  test  site  area.  The  open  field  site  is  the  largest  of  the 
test  sites  and  measures  approximately  200  by  350  meters.  To  the  east  of  the  open  field  range  are 
the  calibration  and  blind  test  grids  that  measure  30  by  40  meters  and  40  by  40  meters, 
respectively.  South  of  the  Open  Field  is  the  135-  by  80-meter  Mogul  area  consisting  of  a 
sequence  of  man-made  depressions.  The  Desert  Extreme  area  is  located  southeast  of  the  open 
field  site  and  has  dimensions  of  50  by  100  meters.  The  Desert  Extreme  area,  covered  with 
desert-type vegetation, is used to test the performance of different sensor platforms in more
severe desert conditions.

2.2.2  Soil  Type 

Soil  samples  were  collected  at  the  YPG  UXO  Standardized  Test  Site  by  ERDC  to 
characterize  the  shallow  subsurface  (<  3  m).  Both  surface  grab  samples  and  continuous  soil 
borings  were  acquired.  The  soils  were  subjected  to  several  laboratory  analyses,  including 
sieve/hydrometer,  water  content,  magnetic  susceptibility,  dielectric  permittivity,  X-ray 
diffraction,  and  visual  description. 

There  are  two  soil  complexes  present  within  the  site,  Riverbend-Carrizo  and 
Cristobal-Gunsight.  The  Riverbend-Carrizo  complex  is  comprised  of  mixed  stream  alluvium, 
whereas  the  Cristobal-Gunsight  complex  is  derived  from  fan  alluvium.  The  Cristobal-Gunsight 
complex  covers  the  majority  of  the  site.  Most  of  the  soil  samples  were  classified  as  either  a 
sandy  loam  or  loamy  sand,  with  most  samples  containing  gravel-size  particles.  All  samples  had 
a measured water content less than 7 percent, except for two that contained 11 percent moisture.
The majority of soil samples had water content between 1 and 2 percent. Samples containing
more  than  3  percent  were  generally  deeper  than  1  meter. 

An  X-ray  diffraction  analysis  on  four  soil  samples  indicated  a  basic  mineralogy  of  quartz, 
calcite,  mica,  feldspar,  magnetite,  and  some  clay.  The  presence  of  magnetite  imparted 
a  moderate  magnetic  susceptibility,  with  volume  susceptibilities  generally  greater  than 
100 × 10^-5 SI.

For more details concerning the soil properties at the YPG test site, go to
www.uxotestsites.org  on  the  web  to  view  the  entire  soils  description  report. 

2.2.3  Test  Areas 


A  description  of  the  test  site  areas  at  YPG  is  included  in  Table  2. 


TABLE  2.  TEST  SITE  AREAS 


Area              Description
Calibration Grid  Contains the 15 standard ordnance items buried in six positions at
                  various angles and depths to allow demonstrator equipment
                  calibration.
Blind Grid        Contains 400 grid cells in a 0.16-hectare (0.39-acre) site. The center
                  of each grid cell contains ordnance, clutter, or nothing.
Open Field        A 4-hectare (10-acre) site containing open areas, dips, ruts, and
                  obstructions, including vegetation.


SECTION 3. FIELD DATA


3.1  DATE  OF  FIELD  ACTIVITIES  (18  through  20  October  2004) 

3.2  AREAS  TESTED/NUMBER  OF  HOURS 


Areas  tested  and  total  number  of  hours  operated  at  each  site  are  summarized  in  Table  3. 


TABLE 3. AREAS TESTED AND NUMBER OF HOURS


Area                 Number of Hours
Calibration Lanes    0.43
Open Field           15.32

3.3 TEST CONDITIONS

3.3.1 Weather Conditions


A  YPG  weather  station  located  approximately  one  mile  west  of  the  test  site  was  used  to 
record average temperature and precipitation every half hour for each day of operation. The
temperatures listed in Table 4 represent the average temperature during field operations from
0700 to 1700 hours, while the precipitation data represent the daily total rainfall. Hourly
weather  logs  used  to  generate  this  summary  are  provided  in  Appendix  B. 


TABLE  4.  TEMPERATURE/PRECIPITATION  DATA  SUMMARY 


Date, 2004    Average Temperature, °F    Total Daily Precipitation, in.
18 October    75.90                      0.00
19 October    74.93                      0.00
20 October    76.50                      0.00

3.3.2  Field  Conditions 


GEO-CENTERS surveyed the Open Field area from 18 through 20 October 2004. The
Open Field was dry and the weather warm throughout.

3.3.3  Soil  Moisture 


Three  soil  probes  were  placed  at  various  locations  within  the  site  to  capture  soil  moisture 
data:  Blind  Grid,  Calibration,  Mogul,  and  Wooded  areas.  Measurements  were  collected  in 
percent  moisture  and  were  taken  twice  daily  (morning  and  afternoon)  from  five  different  soil 
depths  (1  to  6  in.,  6  to  12  in.,  12  to  24  in.,  24  to  36  in.,  and  36  to  48  in.)  from  each  probe.  Soil 
moisture  logs  are  included  in  Appendix  C. 



3.4  FIELD  ACTIVITIES 


3.4.1 Setup/Mobilization

These activities included initial mobilization and daily equipment preparation and
breakdown. A two-person crew took 2 hours and 54 minutes to perform the initial setup
and mobilization. An additional 2 hours and 49 minutes of daily equipment preparation
and 20 minutes of equipment breakdown took place in the Open Field.

3.4.2  Calibration 


GEO-CENTERS worked in the Calibration Lanes on 18 October for 26 minutes, all of which
was spent collecting data. No other calibration activities occurred while surveying the
Open Field.


3.4.3  Downtime  Occasions 


Occasions of downtime are grouped into five categories: equipment/data checks or
equipment maintenance, equipment failure and repair, weather, Demonstration Site issues, and
breaks/lunch. All downtime is included for the purposes of calculating labor costs (section 5)
except  for  downtime  due  to  Demonstration  Site  issues.  Demonstration  Site  issues,  while  noted  in 
the  Daily  Log,  are  considered  nonchargeable  downtime  for  the  purposes  of  calculating  labor 
costs  and  are  not  discussed.  Breaks  and  lunches  are  discussed  in  this  section  and  billed  to  the 
total  Site  Survey  area. 

3.4.3.1 Equipment/data checks, maintenance. Equipment/data checks and maintenance
activities accounted for 3 hours and 37 minutes in the Open Field. GEO-CENTERS also spent
1 hour and 17 minutes on breaks and lunches.

3.4.3.2 Equipment failure or repair. One equipment failure occurred during the Open Field
survey: GEO-CENTERS experienced poor GPS satellite quality for 5 minutes on 19 October. The
situation resolved itself, and no other problems occurred.

3.4.3.3  Weather.  No  weather  delays  occurred  during  the  survey. 

3.4.4  Data  Collection 


GEO-CENTERS spent a total of 15 hours and 19 minutes in the Open Field, of which
7 hours and 11 minutes was spent collecting data.

3.4.5  Demobilization 


The  GEO-CENTERS  survey  crew  went  on  to  conduct  a  full  demonstration  of  the  site. 
Therefore,  demobilization  did  not  occur  until  20  October  2004.  On  that  day,  it  took  the  crew 
1  hour  and  53  minutes  to  break  down  and  pack  up  their  equipment. 

3.5 PROCESSING TIME


GEO-CENTERS  submitted  the  raw  data  from  the  demonstration  activities  on  the  last  day 
of  the  demonstration,  as  required.  The  scoring  submittal  data  was  also  provided  within  the 
required  30-day  timeframe. 

3.6  DEMONSTRATOR’S  FIELD  PERSONNEL 

Robert  Siegel,  GEO-CENTERS,  Project  Manager  and  Data  Analyst 
David  Fanning,  under  contract  to  GEO-CENTERS,  truck  driver 

Alan  Crandall,  under  contract  to  GEO-CENTERS,  U.S.  Environmental,  Field  Supervisor 

3.7  DEMONSTRATOR’S  FIELD  SURVEYING  METHOD 

GEO-CENTERS surveyed the Open Field in a linear fashion, west to east.

3.8  SUMMARY  OF  DAILY  LOGS 

Daily  logs  capture  all  field  activities  during  this  demonstration  and  are  located  in 
Appendix  D.  Activities  pertinent  to  this  specific  demonstration  are  indicated  in  highlighted  text. 

SECTION 4. TECHNICAL PERFORMANCE RESULTS


4.1  ROC  CURVES  USING  ALL  ORDNANCE  CATEGORIES 

Figures 2, 4, and 6 show the probability of detection for the response stage (Pdres) and the
discrimination stage (Pddisc) versus their respective probability of false positive for the EM
sensor(s), MAG sensor(s), and combined EM/MAG picks, respectively. Figures 3, 5, and 7 show
both probabilities plotted against their respective background alarm rate. All figures use
horizontal lines to illustrate the performance of the demonstrator at two demonstrator-specified
points: at the system noise level for the response stage, representing the point below which
targets are not considered detectable, and at the demonstrator’s recommended threshold level for
the discrimination stage, defining the subset of targets the demonstrator would recommend
digging based on discrimination. Note that all points have been rounded to protect the ground
truth.


The  overall  ground  truth  is  composed  of  ferrous  and  non-ferrous  anomalies.  Due  to 
limitations  of  the  magnetometer,  the  non-ferrous  items  cannot  be  detected.  Therefore,  the  ROC 
curves  presented  in  figures  4  and  5  of  this  section  are  based  on  the  subset  of  the  ground  truth  that 
is  solely  made  up  of  ferrous  anomalies. 


Figure 2. EM Sensor open field probability of detection for response and discrimination stages versus
their respective probability of false positive over all ordnance categories combined.

Figure 3. EM Sensor open field probability of detection for response and discrimination stages versus
their respective background alarm rate over all ordnance categories combined.

Figure 4. MAG Sensor open field probability of detection for response and discrimination stages versus
their respective probability of false positive over all ordnance categories combined.

Figure 5. MAG Sensor open field probability of detection for response and discrimination stages versus
their respective background alarm rate over all ordnance categories combined.

Figure 6. Combined Sensor open field probability of detection for response and discrimination stages
versus their respective probability of false positive over all ordnance categories combined.

Figure 7. Combined Sensor open field probability of detection for response and discrimination stages
versus their respective background alarm rate over all ordnance categories combined.


4.2  ROC  CURVES  USING  ORDNANCE  LARGER  THAN  20  MM 

Figures 8, 10, and 12 show the probability of detection for the response stage (Pdres) and
the discrimination stage (Pddisc) versus their respective probability of false positive when only
targets larger than 20 mm are scored for the EM sensor(s), MAG sensor(s), and combined
EM/MAG picks, respectively. Figures 9, 11, and 13 show both probabilities plotted against their
respective background alarm rate. All figures use horizontal lines to illustrate the
performance of the demonstrator at two demonstrator-specified points: at the system noise level
for the response stage, representing the point below which targets are not considered detectable,
and at the demonstrator’s recommended threshold level for the discrimination stage, defining the
subset of targets the demonstrator would recommend digging based on discrimination. Note that
all points have been rounded to protect the ground truth.

The  overall  ground  truth  is  composed  of  ferrous  and  non-ferrous  anomalies.  Due  to 
limitations  of  the  magnetometer,  the  non-ferrous  items  cannot  be  detected.  Therefore,  the  ROC 
curves  presented  in  figures  10  and  11  of  this  section  are  based  on  the  subset  of  the  ground  truth 
that  is  solely  made  up  of  ferrous  anomalies. 

Figure 8. EM Sensor open field probability of detection for response and discrimination stages versus
their respective probability of false positive for all ordnance larger than 20 mm.

Figure 9. EM Sensor open field probability of detection for response and discrimination stages versus
their respective background alarm rate for all ordnance larger than 20 mm.

Figure 10. MAG Sensor open field probability of detection for response and discrimination stages versus
their respective probability of false positive for all ordnance larger than 20 mm.

Figure 11. MAG Sensor open field probability of detection for response and discrimination stages versus
their respective background alarm rate for all ordnance larger than 20 mm.

Figure 12. Combined Sensor open field probability of detection for response and discrimination stages
versus their respective probability of false positive for all ordnance larger than 20 mm.

Figure 13. Combined Sensor open field probability of detection for response and discrimination stages
versus their respective background alarm rate for all ordnance larger than 20 mm.


4.3 PERFORMANCE SUMMARIES


Results for the Open Field test, broken out by sensor type, size, depth, and nonstandard
ordnance, are presented in Tables 5a, 5b, and 5c (for cost results, see section 5). Results by size and
depth include both standard and nonstandard ordnance. The results by size show how well the
demonstrator did at detecting/discriminating ordnance of a certain caliber range (see Appendix A for
size definitions). The results are relative to the number of ordnance items emplaced. Depth is measured
from the geometric center of anomalies.

The RESPONSE STAGE results are derived from the list of anomalies above the
demonstrator-provided noise level. The results for the DISCRIMINATION STAGE are derived
from the demonstrator’s recommended threshold for optimizing UXO field cleanup by minimizing
false digs and maximizing ordnance recovery. The lower 90-percent confidence limits on Pd
and Pfp were calculated assuming that the numbers of detections and false positives are
binomially distributed random variables. All results in Table 5 have been rounded to protect the
ground truth. However, lower confidence limits were calculated using actual results.
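
The report does not state which method was used to compute the lower confidence limit. One common choice for a binomially distributed count is the exact one-sided (Clopper-Pearson) bound, sketched below as an assumption. The function names and the counts in the usage note are hypothetical and are not taken from the scored data.

```python
from math import comb


def binom_tail_ge(x, n, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def lower_conf_limit(x, n, conf=0.90):
    """Exact one-sided lower confidence limit on p, given x successes in n trials.

    Solves P(X >= x | p) = 1 - conf for p by bisection. This is an assumed
    method (Clopper-Pearson), not one confirmed by the report.
    """
    if x == 0:
        return 0.0
    alpha = 1.0 - conf
    lo, hi = 0.0, 1.0
    for _ in range(60):  # the tail probability increases with p, so bisect
        mid = (lo + hi) / 2.0
        if binom_tail_ge(x, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `lower_conf_limit(45, 60)` returns the lower 90-percent limit for a hypothetical 45 detections out of 60 emplaced items. Because the limit depends on the raw counts, it cannot be recovered from the rounded Pd values in the tables.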

The  overall  ground  truth  is  composed  of  ferrous  and  non-ferrous  anomalies.  Due  to  limitations 
of  the  magnetometer,  the  non-ferrous  items  cannot  be  detected.  Therefore,  the  summary  presented  in 
Table  5b  is  split  exhibiting  results  based  on  the  subset  of  the  ground  truth  that  is  solely  the  ferrous 
anomalies  and  the  full  ground  truth  for  comparison  purposes. 

All  other  tables  presented  in  this  section  are  based  on  scoring  against  the  ferrous  only  ground 
truth.  The  response  stage  noise  level  and  recommended  discrimination  stage  threshold  values  are 
provided  by  the  demonstrator. 


TABLE  5a.  SUMMARY  OF  OPEN  FIELD  RESULTS  FOR  THE 
STOLS/TOWED  ARRAY  (EM  SENSOR) 


                                                    By Size               By Depth, m
Metric              Overall  Standard  Nonstandard  Small  Medium  Large  < 0.3  0.3 to < 1  >= 1

RESPONSE STAGE
Pd                  0.70     0.70      0.75         0.65   0.75    0.90   0.70   0.75        0.60
Pd Low 90% Conf     0.69     0.66      0.70         0.59   0.70    0.83   0.68   0.70        0.47
Pd Upper 90% Conf   0.74     0.73      0.78         0.67   0.80    0.93   0.75   0.79        0.69
Pfp                 0.70     -         -            -      -       -      0.65   0.75        0.40
Pfp Low 90% Conf    0.66     -         -            -      -       -      0.63   0.73        0.19
Pfp Upper 90% Conf  0.70     -         -            -      -       -      0.67   0.79        0.65
BAR                 0.00     -         -            -      -       -      -      -           -

DISCRIMINATION STAGE
Pd                  0.40     0.40      0.35         0.05   0.70    0.85   0.25   0.60        0.55
Pd Low 90% Conf     0.36     0.37      0.33         0.05   0.65    0.81   0.21   0.56        0.45
Pd Upper 90% Conf   0.42     0.44      0.42         0.09   0.75    0.91   0.28   0.66        0.67
Pfp                 0.45     -         -            -      -       -      0.40   0.65        0.40
Pfp Low 90% Conf    0.44     -         -            -      -       -      0.36   0.64        0.19
Pfp Upper 90% Conf  0.48     -         -            -      -       -      0.40   0.71        0.65
BAR                 0.00     -         -            -      -       -      -      -           -

Response Stage Noise Level: -0.90
Recommended Discrimination Stage Threshold: 4.00


TABLE 5b. SUMMARY OF OPEN FIELD RESULTS FOR THE
STOLS/TOWED ARRAY (MAG SENSOR)


Ferrous Only Ground Truth

                                                    By Size               By Depth, m
Metric              Overall  Standard  Nonstandard  Small  Medium  Large  < 0.3  0.3 to < 1  >= 1

RESPONSE STAGE
Pd                  0.40     0.40      0.40         0.15   0.50    0.80   0.35   0.60        0.30
Pd Low 90% Conf     0.39     0.38      0.37         0.14   0.45    0.76   0.29   0.56        0.20
Pd Upper 90% Conf   0.45     0.47      0.46         0.21   0.57    0.87   0.37   0.67        0.41
Pfp                 0.50     -         -            -      -       -      0.45   0.60        0.10
Pfp Low 90% Conf    0.48     -         -            -      -       -      0.44   0.58        0.01
Pfp Upper 90% Conf  0.52     -         -            -      -       -      0.48   0.65        0.34
BAR                 0.05     -         -            -      -       -      -      -           -

DISCRIMINATION STAGE
Pd                  0.40     0.40      0.35         0.10   0.50    0.80   0.30   0.60        0.30
Pd Low 90% Conf     0.36     0.37      0.32         0.09   0.44    0.75   0.24   0.55        0.18
Pd Upper 90% Conf   0.42     0.46      0.41         0.16   0.55    0.86   0.32   0.67        0.39
Pfp                 0.50     -         -            -      -       -      0.45   0.60        0.00
Pfp Low 90% Conf    0.47     -         -            -      -       -      0.42   0.58        0.00
Pfp Upper 90% Conf  0.50     -         -            -      -       -      0.46   0.65        0.21
BAR                 0.05     -         -            -      -       -      -      -           -

Full Ground Truth

                                                    By Size               By Depth, m
Metric              Overall  Standard  Nonstandard  Small  Medium  Large  < 0.3  0.3 to < 1  >= 1

RESPONSE STAGE
Pd                  0.35     0.35      0.40         0.10   0.50    0.80   0.25   0.55        0.30
Pd Low 90% Conf     0.33     0.30      0.35         0.10   0.45    0.76   0.24   0.48        0.20
Pd Upper 90% Conf   0.39     0.37      0.44         0.15   0.57    0.87   0.31   0.59        0.40
Pfp                 0.50     -         -            -      -       -      0.45   0.60        0.10
Pfp Low 90% Conf    0.48     -         -            -      -       -      0.44   0.58        0.01
Pfp Upper 90% Conf  0.52     -         -            -      -       -      0.48   0.65        0.34
BAR                 0.05     -         -            -      -       -      -      -           -

DISCRIMINATION STAGE
Pd                  0.35     0.30      0.35         0.10   0.50    0.80   0.25   0.55        0.25
Pd Low 90% Conf     0.31     0.29      0.30         0.07   0.44    0.75   0.20   0.48        0.18
Pd Upper 90% Conf   0.36     0.36      0.39         0.11   0.55    0.86   0.27   0.58        0.38
Pfp                 0.50     -         -            -      -       -      0.45   0.60        0.00
Pfp Low 90% Conf    0.47     -         -            -      -       -      0.42   0.58        0.00
Pfp Upper 90% Conf  0.50     -         -            -      -       -      0.46   0.65        0.21
BAR                 0.05     -         -            -      -       -      -      -           -

Response Stage Noise Level: -0.90
Recommended Discrimination Stage Threshold: 0.04


TABLE 5c. SUMMARY OF OPEN FIELD RESULTS FOR THE
STOLS/TOWED ARRAY (COMBINED EM/MAG RESULTS)


                                                    By Size               By Depth, m
Metric              Overall  Standard  Nonstandard  Small  Medium  Large  < 0.3  0.3 to < 1  >= 1

RESPONSE STAGE
Pd                  0.70     0.70      0.75         0.65   0.75    0.90   0.75   0.75        0.60
Pd Low 90% Conf     0.69     0.67      0.70         0.61   0.71    0.83   0.69   0.70        0.47
Pd Upper 90% Conf   0.75     0.74      0.79         0.68   0.81    0.93   0.76   0.79        0.69
Pfp                 0.70     -         -            -      -       -      0.65   0.78        0.40
Pfp Low 90% Conf    0.68     -         -            -      -       -      0.65   0.75        0.19
Pfp Upper 90% Conf  0.72     -         -            -      -       -      0.69   0.81        0.65
BAR                 0.05     -         -            -      -       -      -      -           -

DISCRIMINATION STAGE
Pd                  0.45     0.40      0.45         0.15   0.70    0.90   0.30   0.60        0.60
Pd Low 90% Conf     0.40     0.39      0.38         0.11   0.63    0.83   0.27   0.56        0.47
Pd Upper 90% Conf   0.45     0.46      0.47         0.16   0.74    0.93   0.34   0.66        0.69
Pfp                 0.55     -         -            -      -       -      0.50   0.75        0.40
Pfp Low 90% Conf    0.54     -         -            -      -       -      0.47   0.72        0.19
Pfp Upper 90% Conf  0.58     -         -            -      -       -      0.51   0.79        0.65
BAR                 0.05     -         -            -      -       -      -      -           -

Response Stage Noise Level: 0.16
Recommended Discrimination Stage Threshold: 0.60

Note:  The  recommended  discrimination  stage  threshold  values  are  provided  by  the  demonstrator. 


4.4  EFFICIENCY,  REJECTION  RATES,  AND  TYPE  CLASSIFICATION 
(All  results  based  on  combined  EM/MAG  data  set) 

Efficiency  and  rejection  rates  are  calculated  to  quantify  the  discrimination  ability  at 
specific  points  of  interest  on  the  ROC  curve:  (1)  at  the  point  where  no  decrease  in  Pd  is  suffered 
(i.e.,  the  efficiency  is  by  definition  equal  to  one)  and  (2)  at  the  operator  selected  threshold. 
These  values  are  reported  in  Table  6. 


TABLE  6.  EFFICIENCY  AND  REJECTION  RATES 


                      Efficiency (E)   False Positive    Background Alarm
                                       Rejection Rate    Rejection Rate

At Operating Point    0.59             0.20              0.32
With No Loss of Pd    1.00             0.00              0.00

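
The quantities in Table 6 compare discrimination-stage results with response-stage results. The sketch below assumes the usual definitions for this kind of scoring: efficiency is the ratio of discrimination-stage Pd to response-stage Pd, and a rejection rate is one minus the ratio of the discrimination-stage to the response-stage false positive (or background alarm) rate. These definitions are not spelled out in this section, the function names are hypothetical, and the inputs in the usage note are illustrative values, not scored results.

```python
def efficiency(pd_disc, pd_res):
    """Fraction of response-stage detections retained after discrimination (assumed definition)."""
    return pd_disc / pd_res


def rejection_rate(rate_disc, rate_res):
    """Fraction of response-stage false alarms rejected by discrimination (assumed definition)."""
    return 1.0 - rate_disc / rate_res
```

With illustrative inputs, `efficiency(0.60, 0.80)` is about 0.75 and `rejection_rate(0.40, 0.50)` is about 0.20. Because the values in Tables 5a through 5c are rounded to protect the ground truth, they will not reproduce Table 6 exactly.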

At the demonstrator’s recommended setting, the ordnance items that were detected and
correctly discriminated were further scored on whether their correct type could be identified
(Table 7). Correct type examples include “20-mm projectile,” “105-mm HEAT Projectile,” and
“2.75-inch Rocket.” A list of the standard type declaration required for each ordnance item was
provided to demonstrators prior to testing. For example, the standard types for the three example
items are 20mmP, 105H, and 2.75in, respectively.


TABLE  7.  CORRECT  TYPE  CLASSIFICATION 
OF  TARGETS  CORRECTLY 
DISCRIMINATED  AS  UXO 


Size       Percentage Correct

Small      NA
Medium     NA
Large      NA
Overall    NA

4.5  LOCATION  ACCURACY 

The  mean  location  error  and  standard  deviations  appear  in  Table  8.  These  calculations  are 
based  on  average  missed  depth  for  ordnance  correctly  identified  in  the  discrimination  stage. 
Depths  are  measured  from  the  closest  point  of  the  ordnance  to  the  surface.  For  the  Blind  Grid, 
only  depth  errors  are  calculated,  since  (X,  Y)  positions  are  known  to  be  the  centers  of  each  grid 
square. 


TABLE 8. MEAN LOCATION ERROR AND
STANDARD DEVIATION (m)


            Mean     Standard Deviation

Northing    -0.01    0.17
Easting     0.00     0.15
Depth       0.10     0.17


SECTION  5.  ON-SITE  LABOR  COSTS 


A  standardized  estimate  for  labor  costs  associated  with  this  effort  was  calculated  as 
follows:  the  first  person  at  the  test  site  was  designated  “supervisor”,  the  second  person  was 
designated  “data  analyst”,  and  the  third  and  following  personnel  were  considered  “field  support”. 
Standardized  hourly  labor  rates  were  charged  by  title:  supervisor  at  $95.00/hour,  data  analyst  at 
$57.00/hour, and field support at $28.50/hour.

Government  representatives  monitored  on-site  activity.  All  on-site  activities  were 
grouped  into  one  of  ten  categories:  initial  setup/mobilization,  daily  setup/stop,  calibration, 
collecting  data,  downtime  due  to  break/lunch,  downtime  due  to  equipment  failure,  downtime  due 
to  equipment/data  checks  or  maintenance,  downtime  due  to  weather,  downtime  due  to 
demonstration  site  issue,  or  demobilization.  See  Appendix  D  for  the  daily  activity  log.  See 
section  3.4  for  a  summary  of  field  activities. 

The  standardized  cost  estimate  associated  with  the  labor  needed  to  perform  the  field 
activities  is  presented  in  Table  9.  Note  that  calibration  time  includes  time  spent  in  the 
Calibration  Lanes  as  well  as  field  calibrations.  “Site  survey  time”  includes  daily  setup/stop  time, 
collecting  data,  breaks/lunch,  downtime  due  to  equipment/data  checks  or  maintenance,  downtime 
due  to  failure,  and  downtime  due  to  weather. 


TABLE  9.  ON-SITE  LABOR  COSTS 


                  No. People   Hourly Wage   Hours    Cost

INITIAL SETUP
Supervisor        1            $95.00        2.90     $275.50
Data Analyst      1            57.00         2.90     165.30
Field Support     0            28.50         2.90     0.00
Subtotal                                              $440.80

CALIBRATION
Supervisor        1            $95.00        0.43     $40.85
Data Analyst      1            57.00         0.43     24.51
Field Support     0            28.50         0.43     0.00
Subtotal                                              $65.36

SITE SURVEY
Supervisor        1            $95.00        15.32    $1,455.40
Data Analyst      1            57.00         15.32    873.24
Field Support     0            28.50         15.32    0.00
Subtotal                                              $2,328.64

DEMOBILIZATION
Supervisor        1            $95.00        1.88     $178.60
Data Analyst      1            57.00         1.88     107.16
Field Support     0            28.50         1.88     0.00
Subtotal                                              $285.76

Total                                                 $3,120.56

Notes: Calibration time includes time spent in the Calibration Lanes as well as calibration
before each data run.

Site Survey time includes daily setup/stop time, collecting data, breaks/lunch, downtime
due to system maintenance, failure, and weather.

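
The arithmetic behind Table 9 can be reproduced from the standardized hourly rates, the crew counts, and the hours per activity. The sketch below does this with exact decimal arithmetic; the dictionary keys and the function name are hypothetical, while the rates, crew counts, and hours are taken from Table 9.

```python
from decimal import Decimal

# Standardized hourly rates by title (section 5).
RATES = {
    "supervisor": Decimal("95.00"),
    "data_analyst": Decimal("57.00"),
    "field_support": Decimal("28.50"),
}

# Number of people charged under each title (Table 9).
CREW = {"supervisor": 1, "data_analyst": 1, "field_support": 0}

# Hours charged to each activity (Table 9).
HOURS = {
    "initial_setup": Decimal("2.90"),
    "calibration": Decimal("0.43"),
    "site_survey": Decimal("15.32"),
    "demobilization": Decimal("1.88"),
}


def activity_cost(hours, crew=CREW, rates=RATES):
    """Sum rate x people x hours over all titles for one activity."""
    return sum((rates[title] * crew[title] * hours for title in rates), Decimal("0"))


subtotals = {name: activity_cost(hours) for name, hours in HOURS.items()}
total = sum(subtotals.values(), Decimal("0"))
```

Decimal arithmetic is used so that the cent values match the table without floating-point rounding.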

SECTION 6. COMPARISON OF RESULTS TO BLIND GRID DEMONSTRATION

(BASED ON COMBINED EM/MAG DATA SETS)


6.1  SUMMARY  OF  RESULTS  FROM  BLIND  GRID  DEMONSTRATION 

Table 10 shows the results from the Blind Grid survey conducted prior to surveying the
Open Field during the same site visit in October 2004. Because the system uses
magnetometer-type sensors, all results presented in this section are based on
performance scoring against the ferrous-only ground truth anomalies. For more details on the
Blind Grid survey results, see section 2.1.6.


TABLE  10.  SUMMARY  OF  BLIND  GRID  RESULTS  FOR  THE 
STOLS/TOWED  ARRAY 


                                                    By Size               By Depth, m
Metric              Overall  Standard  Nonstandard  Small  Medium  Large  < 0.3  0.3 to < 1  >= 1

RESPONSE STAGE
Pd                  1.00     1.00      1.00         0.95   1.00    1.00   1.00   0.95        1.00
Pd Low 90% Conf     0.95     0.92      0.92         0.90   0.90    0.85   0.95   0.85        0.72
Pd Upper 90% Conf   1.00     1.00      1.00         1.00   1.00    1.00   1.00   1.00        1.00
Pfp                 1.00     -         -            -      -       -      1.00   1.00


6.2 COMPARISON OF ROC CURVES USING ALL ORDNANCE CATEGORIES

Figure 6 shows Pdres versus the respective Pfp over all ordnance categories. Figure 7 shows
Pddisc versus the respective Pfp over all ordnance categories. Figure 7 uses horizontal lines to
illustrate the performance of the demonstrator at the recommended discrimination threshold
levels, defining the subset of targets the demonstrator would recommend digging based on
discrimination. The ROC curves in this section are based solely on the ferrous-only survey.

Legend: Blind Grid 293 Noise Level; Blind Grid 293; Open Field 299.

Figure 6. STOLS/towed array dual mode Pdres versus the respective Pfp over all
ordnance categories combined.

Legend: Blind Grid 293 Threshold; Blind Grid 293; Open Field 299; Open Field 299 Threshold.

Figure 7. STOLS/towed array dual mode Pddisc versus the respective Pfp over all ordnance
categories combined.


6.3 COMPARISON OF ROC CURVES USING ORDNANCE LARGER THAN 20 MM


Figure 8 shows Pdres versus the respective Pfp over ordnance larger than
20 mm. Figure 9 shows Pddisc versus the respective Pfp over ordnance larger than 20 mm.
Figure  9  uses  horizontal  lines  to  illustrate  the  performance  of  the  demonstrator  at  the 
recommended  discrimination  threshold  levels,  defining  the  subset  of  targets  the  demonstrator 
would  recommend  digging  based  on  discrimination. 


Legend: Blind Grid 293 Noise Level; Blind Grid 293; Open Field 299.

Figure 8. STOLS/towed array dual mode Pdres versus the respective Pfp for ordnance larger than
20 mm.

Legend: Blind Grid 293 Threshold; Blind Grid 293; Open Field 299; Open Field 299 Threshold.

Figure 9. STOLS/towed array dual mode Pddisc versus the respective Pfp for ordnance larger than
20 mm.


6.4  STATISTICAL  COMPARISONS 

Statistical  Chi-square  significance  tests  were  used  to  compare  results  between  the  Blind 
Grid  and  Open  Field  scenarios.  The  intent  of  the  comparison  is  to  determine  if  the  feature 
introduced  in  each  scenario  has  a  degrading  effect  on  the  performance  of  the  sensor  system. 
However,  any  modifications  in  the  UXO  sensor  system  during  the  test,  like  changes  in  the 
processing  or  changes  in  the  selection  of  the  operating  threshold,  will  also  contribute  to 
performance  differences. 

The Chi-square test for comparison between ratios was used at a significance level of
0.05 to compare Blind Grid to Open Field with regard to Pdres, Pddisc, Pfpres, Pfpdisc, Efficiency,
and Rejection Rate. These results are presented in Table 11. A detailed explanation and
example of the Chi-square application is located in Appendix A.
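
As a rough illustration of a Chi-square comparison between two ratios, the sketch below applies the standard 2 x 2 Pearson Chi-square statistic (one degree of freedom, no continuity correction) to two success counts. Whether the program's scoring used exactly this form is an assumption; the function name and the counts in the usage note are hypothetical.

```python
from math import erfc, sqrt


def chi_square_two_proportions(hits_a, n_a, hits_b, n_b):
    """Pearson Chi-square test (1 degree of freedom) that two proportions are equal.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    a, b = hits_a, n_a - hits_a  # scenario A: successes, failures
    c, d = hits_b, n_b - hits_b  # scenario B: successes, failures
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information; treat as no difference.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the Chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

For hypothetical counts of 90 detections out of 100 in one scenario and 70 out of 100 in the other, `chi_square_two_proportions(90, 100, 70, 100)` gives a statistic of 12.5 and a p-value below 0.05, which would be reported as "Significant" at the 0.05 level.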

TABLE 11. CHI-SQUARE RESULTS - BLIND GRID VERSUS OPEN FIELD


Metric            Small             Medium            Large             Overall

Pdres             Significant       Not Significant   Not Significant   Significant
Pddisc            Significant       Significant       Not Significant   Significant
Pfpres            Not Significant   Not Significant   Not Significant   Significant
Pfpdisc           -                 -                 -                 Not Significant
Efficiency        -                 -                 -                 Significant
Rejection rate    -                 -                 -                 Significant


SECTION 7. APPENDIXES


APPENDIX  A.  TERMS  AND  DEFINITIONS 
GENERAL  DEFINITIONS 

Anomaly:  Location  of  a  system  response  deemed  to  warrant  further  investigation  by  the 
demonstrator  for  consideration  as  an  emplaced  ordnance  item. 

Detection: An anomaly location that is within Rhalo of an emplaced ordnance item.

Emplaced  Ordnance:  An  ordnance  item  buried  by  the  government  at  a  specified  location  in  the 
test  site. 

Emplaced  Clutter:  A  clutter  item  (i.e.,  non-ordnance  item)  buried  by  the  government  at  a 
specified  location  in  the  test  site. 

Rhalo: A pre-determined radius about the periphery of an emplaced item (clutter or ordnance)
within which a location identified by the demonstrator as being of interest is considered to be a
response from that item. If multiple declarations lie within Rhalo of any item (clutter or
ordnance), the declaration with the highest signal output within the Rhalo will be utilized. For the
purpose of this program, a circular halo 0.5 meters in radius will be placed around the center of
the object for all clutter and ordnance items less than 0.6 meters in length. When ordnance items
are longer than 0.6 meters, the halo becomes an ellipse where the minor axis remains 1 meter and
the major axis is equal to the length of the ordnance plus 1 meter.
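
The halo rule above can be sketched as a simple geometric test. The function name and arguments below are hypothetical. Two points are assumptions not stated in the definition: the ellipse's major axis is taken to lie along the item's horizontal azimuth, and an item exactly 0.6 meters long is given the circular halo.

```python
from math import cos, sin, hypot


def within_halo(dx, dy, item_length_m, azimuth_rad=0.0):
    """Return True if a declaration offset (dx, dy) meters from the item center lies in its halo.

    Items up to 0.6 m long get a circular halo of 0.5 m radius. Longer items
    get an ellipse with a 1 m minor axis and a major axis of length + 1 m,
    assumed here to be aligned with the item's azimuth.
    """
    if item_length_m <= 0.6:
        return hypot(dx, dy) <= 0.5
    semi_minor = 0.5
    semi_major = (item_length_m + 1.0) / 2.0
    # Rotate the offset into the item's frame: along and across its long axis.
    along = dx * cos(azimuth_rad) + dy * sin(azimuth_rad)
    across = -dx * sin(azimuth_rad) + dy * cos(azimuth_rad)
    return (along / semi_major) ** 2 + (across / semi_minor) ** 2 <= 1.0
```

For a hypothetical 1.0-meter item lying along the x axis, a declaration 0.9 meters away along the axis falls inside the halo, while one 0.6 meters to the side does not.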

Small  Ordnance:  Caliber  of  ordnance  less  than  or  equal  to  40  mm  (includes  20-mm  projectile, 
40-mm  projectile,  submunitions  BLU-26,  BLU-63,  and  M42). 

Medium  Ordnance:  Caliber  of  ordnance  greater  than  40  mm  and  less  than  or  equal  to  81  mm 
(includes 57-mm projectile, 60-mm mortar, 2.75 in. Rocket, MK118 Rockeye, 81-mm mortar).

Large  Ordnance:  Caliber  of  ordnance  greater  than  81  mm  (includes  105-mm  HEAT,  105-mm 
projectile,  155-mm  projectile,  500-pound  bomb). 

Shallow:  Items  buried  less  than  0.3  meter  below  ground  surface. 

Medium:  Items  buried  greater  than  or  equal  to  0.3  meter  and  less  than  1  meter  below  ground 
surface. 

Deep:  Items  buried  greater  than  or  equal  to  1  meter  below  ground  surface. 
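
The size and depth categories defined above translate directly into threshold checks. The sketch below is one way to write them; the function names are hypothetical, and the boundaries are taken from the definitions in this appendix.

```python
def size_class(caliber_mm):
    """Classify ordnance by caliber in millimeters."""
    if caliber_mm <= 40:
        return "Small"
    if caliber_mm <= 81:
        return "Medium"
    return "Large"


def depth_class(depth_m):
    """Classify burial depth in meters below ground surface."""
    if depth_m < 0.3:
        return "Shallow"
    if depth_m < 1.0:
        return "Medium"
    return "Deep"
```

For example, a 2.75 in. rocket (about 70 mm) falls in the Medium size class, and an item buried at exactly 1 meter is Deep.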

Response  Stage  Noise  Level:  The  level  that  represents  the  point  below  which  anomalies  are  not 
considered  detectable.  Demonstrators  are  required  to  provide  the  recommended  noise  level  for 
the  Blind  Grid  test  area. 

Discrimination Stage Threshold: The demonstrator-selected threshold level believed to
provide optimum performance of the system by retaining all detectable ordnance and rejecting
the maximum amount of clutter. This level defines the subset of anomalies the demonstrator
would recommend digging based on discrimination.

Binomially Distributed Random Variable: A trial that has only two possible outcomes, say
success and failure, is repeated for n independent trials, with the probability p of success
and the probability 1-p of failure being the same for each trial. The number of successes x
observed in the n trials is a binomially distributed random variable, and the ratio x/n is an
estimate of p.

RESPONSE  AND  DISCRIMINATION  STAGE  DATA 

The  scoring  of  the  demonstrator’s  performance  is  conducted  in  two  stages.  These  two 
stages  are  termed  the  RESPONSE  STAGE  and  DISCRIMINATION  STAGE.  For  both  stages, 
the  probability  of  detection  (Pd)  and  the  false  alarms  are  reported  as  receiver  operating 
characteristic  (ROC)  curves.  False  alarms  are  divided  into  those  anomalies  that  correspond  to 
emplaced  clutter  items,  measuring  the  probability  of  false  positive  (Pfp)  and  those  that  do  not 
correspond  to  any  known  item,  termed  background  alarms. 

The  RESPONSE  STAGE  scoring  evaluates  the  ability  of  the  system  to  detect  emplaced 
targets  without  regard  to  ability  to  discriminate  ordnance  from  other  anomalies.  For  the 
RESPONSE  STAGE,  the  demonstrator  provides  the  scoring  committee  with  the  location  and 
signal  strength  of  all  anomalies  that  the  demonstrator  has  deemed  sufficient  to  warrant  further 
investigation  and/or  processing  as  potential  emplaced  ordnance  items.  This  list  is  generated  with 
minimal  processing  (e.g.,  this  list  will  include  all  signals  above  the  system  noise  threshold).  As 
such,  it  represents  the  most  inclusive  list  of  anomalies. 

The  DISCRIMINATION  STAGE  evaluates  the  demonstrator’s  ability  to  correctly  identify 
ordnance  as  such,  and  to  reject  clutter.  For  the  same  locations  as  in  the  RESPONSE  STAGE 
anomaly  list,  the  DISCRIMINATION  STAGE  list  contains  the  output  of  the  algorithms  applied 
in  the  discrimination-stage  processing.  This  list  is  prioritized  based  on  the  demonstrator’s 
determination  that  an  anomaly  location  is  likely  to  contain  ordnance.  Thus,  higher  output  values 
are  indicative  of  higher  confidence  that  an  ordnance  item  is  present  at  the  specified  location.  For 
electronic  signal  processing,  priority  ranking  is  based  on  algorithm  output.  For  other  systems, 
priority ranking is based on human judgment. The demonstrator also selects the threshold that
the demonstrator believes will provide “optimum” system performance (i.e., the threshold that
retains all the detected ordnance and rejects the maximum amount of clutter).

Note: The two lists provided by the demonstrator contain identical numbers of potential target
locations. They differ only in the priority ranking of the declarations.

RESPONSE STAGE DEFINITIONS


Response Stage Probability of Detection (Pdres): Pdres = (No. of response-stage detections)/
(No. of emplaced ordnance in the test site).

Response Stage False Positive (fpres): An anomaly location that is within Rhalo of an emplaced
clutter item.

Response  Stage  Probability  of  False  Positive  (Pfpres):  Pfpres  =  (No.  of  response-stage  false 
positives)/(No.  of  emplaced  clutter  items). 

Response Stage Background Alarm (bares): In the blind grid, an anomaly in a grid cell that
contains neither emplaced ordnance nor an emplaced clutter item. In the open field or
scenarios, an anomaly location that is outside Rhalo of any emplaced ordnance or emplaced
clutter item.

Response  Stage  Probability  of  Background  Alarm  (Pbares):  Blind  Grid  only:  Pbares  =  (No.  of 
response-stage  background  alarms)/(No.  of  empty  grid  locations). 

Response Stage Background Alarm Rate (BARres): Open Field only: BARres = (No. of
response-stage background alarms)/(arbitrary constant).

Note that the quantities Pdres, Pfpres, Pbares, and BARres are functions of tres, the threshold
applied to the response-stage signal strength. These quantities can therefore be written as
Pdres(tres), Pfpres(tres), Pbares(tres), and BARres(tres).
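
As a rough illustration of the response-stage definitions above, the following Python sketch computes Pdres, Pfpres, Pbares, and BARres at a single threshold tres. It assumes that each reported anomaly has already been matched against ground truth (by grid cell in the blind grid, or within Rhalo in the open field), that at most one anomaly is credited per emplaced item, and that an anomaly is retained when its signal is at or above the threshold. The `Anomaly` class, the label strings, and the `bar_constant` parameter are hypothetical names introduced for this example; they are not taken from the scoring software.

```python
from dataclasses import dataclass

@dataclass
class Anomaly:
    signal: float  # response-stage signal strength reported by the demonstrator
    match: str     # hypothetical label: "ordnance", "clutter", or "background"

def response_stage_metrics(anomalies, t_res, n_ordnance, n_clutter,
                           n_empty_cells=None, bar_constant=None):
    """Return Pd, Pfp and, where applicable, Pba and BAR at threshold t_res."""
    # Assumption: an anomaly is retained when its signal is >= the threshold.
    kept = [a for a in anomalies if a.signal >= t_res]
    counts = {label: sum(a.match == label for a in kept)
              for label in ("ordnance", "clutter", "background")}
    metrics = {
        "Pd": counts["ordnance"] / n_ordnance,
        "Pfp": counts["clutter"] / n_clutter,
    }
    if n_empty_cells is not None:   # Blind Grid only
        metrics["Pba"] = counts["background"] / n_empty_cells
    if bar_constant is not None:    # Open Field only
        metrics["BAR"] = counts["background"] / bar_constant
    return metrics

# Made-up blind grid data: 4 emplaced ordnance, 2 emplaced clutter, 4 empty cells.
anomalies = [
    Anomaly(9.0, "ordnance"), Anomaly(7.5, "ordnance"), Anomaly(2.0, "ordnance"),
    Anomaly(6.0, "clutter"), Anomaly(1.5, "clutter"),
    Anomaly(3.0, "background"),
]
print(response_stage_metrics(anomalies, t_res=2.5, n_ordnance=4, n_clutter=2,
                             n_empty_cells=4))
```

Calling the function again with a different `t_res` gives the same quantities at another threshold, which is the sense in which they are functions of tres.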

DISCRIMINATION  STAGE  DEFINITIONS 

Discrimination:  The  application  of  a  signal  processing  algorithm  or  human  judgment  to 
response-stage  data  that  discriminates  ordnance  from  clutter.  Discrimination  should  identify 
anomalies  that  the  demonstrator  has  high  confidence  correspond  to  ordnance,  as  well  as  those 
that  the  demonstrator  has  high  confidence  correspond  to  nonordnance  or  background  returns. 
The  former  should  be  ranked  with  highest  priority  and  the  latter  with  lowest. 

Discrimination Stage Probability of Detection (Pddisc): Pddisc = (No. of discrimination-stage
detections)/(No. of emplaced ordnance in the test site).

Discrimination Stage False Positive (fpdisc): An anomaly location that is within Rhalo of an
emplaced clutter item.

Discrimination Stage Probability of False Positive (Pfpdisc): Pfpdisc = (No. of
discrimination-stage false positives)/(No. of emplaced clutter items).

Discrimination Stage Background Alarm (badisc): An anomaly in a blind grid cell that contains
neither emplaced ordnance nor an emplaced clutter item. An anomaly location in the open field
or scenarios that is outside Rhalo of any emplaced ordnance or emplaced clutter item.

Discrimination Stage Probability of Background Alarm (Pbadisc): Pbadisc = (No. of
discrimination-stage background alarms)/(No. of empty grid locations).

Discrimination Stage Background Alarm Rate (BARdisc): BARdisc = (No. of discrimination-stage
background alarms)/(arbitrary constant).

Note that the quantities Pddisc, Pfpdisc, Pbadisc, and BARdisc are functions of tdisc, the threshold
applied to the discrimination-stage signal strength. These quantities can therefore be written as
Pddisc(tdisc), Pfpdisc(tdisc), Pbadisc(tdisc), and BARdisc(tdisc).

RECEIVER-OPERATING CHARACTERISTIC (ROC) CURVES

ROC curves at both the response and discrimination stages can be constructed based on the
above definitions. The ROC curves plot Pd versus Pfp and Pd versus BAR or Pba as the
threshold applied to the signal strength is varied from its minimum (tmin) to its maximum
(tmax) value.¹ Figure A-1 shows how Pd versus Pfp and Pd versus BAR are combined into ROC
curves. Note that the “res” and “disc” superscripts have been suppressed from all the
variables for clarity.


Figure A-1. ROC curves for open field testing. Each curve applies to both the response and
discrimination stages.
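
The threshold sweep that produces such curves can be sketched in Python as follows. This is a minimal illustration only: it assumes the anomalies have already been labeled against ground truth, that an anomaly is retained when its signal is at or above the threshold, and that every distinct reported signal value is used as a candidate threshold. The function name, the label strings, and the data are hypothetical.

```python
def roc_points(anomalies, n_ordnance, n_clutter):
    """Sweep the threshold from its minimum to its maximum reported value.

    anomalies: list of (signal, label) pairs, label "ordnance" or "clutter".
    Returns a list of (threshold, Pd, Pfp) tuples in ascending threshold order.
    """
    thresholds = sorted({signal for signal, _ in anomalies})
    points = []
    for t in thresholds:
        p_d = sum(1 for s, lab in anomalies
                  if lab == "ordnance" and s >= t) / n_ordnance
        p_fp = sum(1 for s, lab in anomalies
                   if lab == "clutter" and s >= t) / n_clutter
        points.append((t, p_d, p_fp))
    return points

# Made-up data: 4 emplaced ordnance (3 reported), 2 emplaced clutter (both reported).
anomalies = [(5.0, "ordnance"), (3.0, "clutter"), (4.0, "ordnance"),
             (1.0, "clutter"), (2.0, "ordnance")]
for t, p_d, p_fp in roc_points(anomalies, n_ordnance=4, n_clutter=2):
    print(f"t={t:.1f}  Pd={p_d:.2f}  Pfp={p_fp:.2f}")
```

Plotting Pd against Pfp for the returned points gives one curve; substituting background alarm counts for the clutter counts gives the Pd versus BAR or Pba curve in the same way.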


¹Strictly speaking, ROC curves plot the Pd versus Pba over a pre-determined and fixed number of
detection opportunities (some of the opportunities are located over ordnance and others are
located over clutter or blank spots). In an open field scenario, each system suppresses its signal
strength reports until some bare-minimum signal response is received by the system.
Consequently, the open field ROC curves do not have information from low signal-output
locations, and, furthermore, different contractors report their signals over a different set of
locations on the ground. These ROC curves are thus not true to the strict definition of ROC
curves as defined in textbooks on detection theory. Note, however, that the ROC curves
obtained in the Blind Grid test sites are true ROC curves.

METRICS TO CHARACTERIZE THE DISCRIMINATION STAGE


The  demonstrator  is  also  scored  on  efficiency  and  rejection  ratio,  which  measure  the 
effectiveness  of  the  discrimination  stage  processing.  The  goal  of  discrimination  is  to  retain  the 
greatest  number  of  ordnance  detections  from  the  anomaly  list,  while  rejecting  the  maximum 
number  of  anomalies  arising  from  nonordnance  items.  The  efficiency  measures  the  amount  of 
detected  ordnance  retained  by  the  discrimination,  while  the  rejection  ratio  measures  the  fraction 
of  false  alarms  rejected.  Both  measures  are  defined  relative  to  the  entire  response  list,  i.e.,  the 
maximum  ordnance  detectable  by  the  sensor  and  its  accompanying  false  positive  rate  or 
background  alarm  rate. 

Efficiency (E): E = Pddisc(tdisc)/Pdres(tminres); Measures (at a threshold of interest) the degree
to which the maximum theoretical detection performance of the sensor system (as determined by
the response stage tmin) is preserved after application of discrimination techniques. Efficiency is
a number between 0 and 1. An efficiency of 1 implies that all of the ordnance initially detected
in the response stage was retained at the specified threshold in the discrimination stage, tdisc.

False  Positive  Rejection  Rate  (Rfp):  Rfp  =  1  -  [Pfpdisc(tdisc)/Pfpres(tminres)] ;  Measures  (at  a 
threshold  of  interest),  the  degree  to  which  the  sensor  system's  false  positive  performance  is 
improved  over  the  maximum  false  positive  performance  (as  determined  by  the  response  stage 
tmin).  The  rejection  rate  is  a  number  between  0  and  1.  A  rejection  rate  of  1  implies  that  all 
emplaced  clutter  initially  detected  in  the  response  stage  were  correctly  rejected  at  the  specified 
threshold  in  the  discrimination  stage. 

Background  Alarm  Rejection  Rate  (Rba): 

Blind  Grid:  Rba  =  1  -  [Pbadisc(tdisc)/Pbares(tminres)]. 

Open Field: Rba = 1 - [BARdisc(tdisc)/BARres(tminres)].

Measures  the  degree  to  which  the  discrimination  stage  correctly  rejects  background  alarms 
initially  detected  in  the  response  stage.  The  rejection  rate  is  a  number  between  0  and  1.  A 
rejection  rate  of  1  implies  that  all  background  alarms  initially  detected  in  the  response  stage  were 
rejected  at  the  specified  threshold  in  the  discrimination  stage. 
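
The three discrimination-stage metrics reduce to simple ratios, as the following Python sketch shows. The input values are made up for illustration, and the function names are hypothetical. The sketch assumes the response-stage quantities are evaluated at tmin and are nonzero.

```python
def efficiency(pd_disc: float, pd_res_tmin: float) -> float:
    """E = Pddisc(tdisc) / Pdres(tminres)."""
    if pd_res_tmin == 0:
        raise ValueError("no ordnance detected in the response stage")
    return pd_disc / pd_res_tmin

def rejection_rate(rate_disc: float, rate_res_tmin: float) -> float:
    """R = 1 - [rate_disc(tdisc) / rate_res(tminres)].

    Applies to Pfp (giving Rfp) and to Pba or BAR (giving Rba)."""
    if rate_res_tmin == 0:
        raise ValueError("no alarms of this type in the response stage")
    return 1.0 - rate_disc / rate_res_tmin

# Made-up values for illustration only.
E = efficiency(pd_disc=0.60, pd_res_tmin=0.80)          # 75% of detections retained
R_fp = rejection_rate(rate_disc=0.30, rate_res_tmin=0.90)
R_ba = rejection_rate(rate_disc=0.02, rate_res_tmin=0.10)
print(round(E, 3), round(R_fp, 3), round(R_ba, 3))
```

An ideal discrimination stage would give values of 1 for all three: every detected ordnance item retained and every false positive and background alarm rejected.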

CHI-SQUARE  COMPARISON  EXPLANATION: 

The  Chi-square  test  for  differences  in  probabilities  (or  2  x  2  contingency  table)  is  used  to 
analyze  two  samples  drawn  from  two  different  populations  to  see  if  both  populations  have  the 
same  or  different  proportions  of  elements  in  a  certain  category.  More  specifically,  two  random 
samples  are  drawn,  one  from  each  population,  to  test  the  null  hypothesis  that  the  probability  of 
event  A  (some  specified  event)  is  the  same  for  both  populations  (ref  3). 

A  2  x  2  contingency  table  is  used  in  the  Standardized  UXO  Technology  Demonstration 
Site  Program  to  determine  if  there  is  reason  to  believe  that  the  proportion  of  ordnance  correctly 
detected/discriminated  by  demonstrator  X’s  system  is  significantly  degraded  by  the  more 
challenging terrain feature introduced. The test statistic of the 2 x 2 contingency table is
compared against the Chi-square distribution with one degree of freedom. Since an association
between the more challenging terrain feature and relatively degraded performance is sought, a
one-sided test is performed. A significance level of 0.05 is chosen, which sets a critical decision
limit of 2.71 from the Chi-square distribution with one degree of freedom (the upper 10 percent
point of that distribution, because the test is one-sided). It is a critical decision limit
because if the test statistic calculated from the data exceeds this value, the two proportions tested
will be considered significantly different. If the test statistic calculated from the data is less than
this value, the two proportions tested will be considered not significantly different.
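
The 2 x 2 test statistic can be sketched in Python as shown below. The text does not state whether a continuity correction is applied; this sketch assumes Yates' continuity correction because it appears to reproduce the statistics quoted in the worked examples later in this appendix. The function name and argument layout are hypothetical.

```python
def chi_square_2x2(success_a, n_a, success_b, n_b, yates=True):
    """Chi-square statistic (one degree of freedom) for two proportions.

    success_a of n_a items detected in area A, success_b of n_b in area B.
    yates=True applies the continuity correction (an assumption, see lead-in).
    """
    a, b = success_a, n_a - success_a   # area A: successes, failures
    c, d = success_b, n_b - success_b   # area B: successes, failures
    n = n_a + n_b
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        # A 0 or 100 percent overall success rate: the statistic is undefined.
        raise ValueError("chi-square test not applicable")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff**2 / margins

CRITICAL_VALUE = 2.71  # one-sided test at the 0.05 level, from the text

# Counts taken from the worked examples in this appendix.
for label, counts in [("Pddisc blind grid vs open field", (80, 100, 6, 10)),
                      ("Pdres open field vs moguls", (8, 10, 20, 33)),
                      ("Pddisc open field vs moguls", (6, 10, 8, 33))]:
    stat = chi_square_2x2(*counts)
    print(f"{label}: {stat:.2f}, significant: {stat > CRITICAL_VALUE}")
```

Under this assumption the three statistics come out at about 1.12, 0.56, and 2.99, which agrees with the quoted 1.12 and 0.56 and is close to the quoted 2.98.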

An exception must be applied when either a 0 or 100 percent success rate occurs in the
sample data. The Chi-square test cannot be used in these instances. Instead, Fisher’s test is
used and the critical decision limit for one-sided tests is the chosen significance level, which in
this case is 0.05. With Fisher’s test, if the test statistic is less than the critical value, the
proportions are considered to be significantly different.
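
A one-sided Fisher’s test can be sketched in Python from the hypergeometric distribution, as shown below. The sketch assumes the test statistic is the probability, with the table margins held fixed, that the second area shows as few or fewer successes than were observed. The function name is hypothetical; the counts are those of the blind grid versus open field response-stage example in this appendix.

```python
from math import comb

def fisher_one_sided(success_a, n_a, success_b, n_b):
    """One-sided Fisher probability that area B has as few or fewer
    successes than observed, given the row and column totals."""
    n = n_a + n_b
    total_success = success_a + success_b
    total_fail = n - total_success
    denom = comb(n, n_b)          # ways to choose which items lie in area B
    p = 0.0
    for k in range(success_b + 1):
        # k successes and (n_b - k) failures fall in area B; comb() returns 0
        # for impossible splits, so infeasible values of k contribute nothing.
        p += comb(total_success, k) * comb(total_fail, n_b - k) / denom
    return p

# 100 of 100 detected in the blind grid, 8 of 10 detected in the open field.
p_value = fisher_one_sided(100, 100, 8, 10)
print(round(p_value, 4))          # 0.0075
print(p_value < 0.05)             # significantly different at the 0.05 level
```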

Standardized  UXO  Technology  Demonstration  Site  examples,  where  blind  grid  results  are 
compared  to  those  from  the  open  field  and  open  field  results  are  compared  to  those  from  one  of 
the scenarios, follow. It should be noted that a significant result does not prove a cause and
effect relationship exists between the two populations of interest; however, it does serve as a tool
to indicate that one data set has experienced a degradation in system performance larger than
can be accounted for merely by chance or random variation. Note also that a result that is not
significant indicates that there is not enough evidence to declare that anything more than chance
or random variation within the same population is at work between the two data sets being
compared.

Demonstrator  X  achieves  the  following  overall  results  after  surveying  each  of  the  three 
progressively  more  difficult  areas  using  the  same  system  (results  indicate  the  number  of 
ordnance  detected  divided  by  the  number  of  ordnance  emplaced): 

             Blind Grid          Open Field       Moguls

Pdres        100/100 = 1.0       8/10 = .80       20/33 = .61

Pddisc       80/100 = 0.80       6/10 = .60       8/33 = .24

Pdres: BLIND GRID versus OPEN FIELD. Using the example data above to compare
probabilities of detection in the response stage, all 100 ordnance out of 100 emplaced ordnance
items were detected in the blind grid while 8 ordnance out of 10 emplaced were detected in the
open field. Fisher’s test must be used since a 100 percent success rate occurs in the data.
Fisher’s test uses the four input values to calculate a test statistic of 0.0075 that is compared
against the critical value of 0.05. Since the test statistic is less than the critical value, the smaller
response stage detection rate (0.80) is considered to be significantly less at the 0.05 level of
significance. While a significant result does not prove a cause and effect relationship exists
between the change in survey area and degradation in performance, it does indicate that the
detection ability of demonstrator X’s system seems to have been degraded in the open field
relative to results from the blind grid using the same system.

Pddisc: BLIND GRID versus OPEN FIELD. Using the example data above to compare
probabilities of detection in the discrimination stage, 80 out of 100 emplaced ordnance items
were correctly discriminated as ordnance in blind grid testing while 6 ordnance out of
10 emplaced were correctly discriminated as such in open field testing. Those four values are
used to calculate a test statistic of 1.12. Since the test statistic is less than the critical value of
2.71, the two discrimination stage detection rates are considered to be not significantly different
at the 0.05 level of significance.

Pdres:  OPEN  FIELD  versus  MOGULS.  Using  the  example  data  above  to  compare 
probabilities  of  detection  in  the  response  stage,  8  out  of  10  and  20  out  of  33  are  used  to  calculate 
a  test  statistic  of  0.56.  Since  the  test  statistic  is  less  than  the  critical  value  of  2.71,  the  two 
response  stage  detection  rates  are  considered  to  be  not  significantly  different  at  the  0.05  level  of 
significance. 

Pddisc:  OPEN  FIELD  versus  MOGULS.  Using  the  example  data  above  to  compare 
probabilities  of  detection  in  the  discrimination  stage,  6  out  of  10  and  8  out  of  33  are  used  to 
calculate  a  test  statistic  of  2.98.  Since  the  test  statistic  is  greater  than  the  critical  value  of  2.71, 
the  smaller  discrimination  stage  detection  rate  is  considered  to  be  significantly  less  at  the 
0.05  level  of  significance.  While  a  significant  result  does  not  prove  a  cause  and  effect 
relationship  exists  between  the  change  in  survey  area  and  degradation  in  performance,  it  does 
indicate  that  the  ability  of  demonstrator  X  to  correctly  discriminate  seems  to  have  been  degraded 
by the mogul terrain relative to results from the flat open field using the same system.

APPENDIX B.  DAILY WEATHER LOGS

TABLE B-1.  WEATHER LOG

Date          Time    Average Temperature, °C    Average Precipitation, in.

10/18/2004    0700    17.7                       0.00
              0800    18.4                       0.00
              0900    21.0                       0.00
              1000    22.9                       0.00
              1100    24.3                       0.00
              1200    25.4                       0.00
              1300    25.7                       0.00
              1400    26.2                       0.00
              1500    26.2                       0.00
              1600    26.2                       0.00
              1700    25.9                       0.00

10/19/2004    0700    NA                         NA
              0800    NA                         NA
              0900    NA                         NA
              1000    NA                         NA
              1100    NA                         NA
              1200    NA                         NA
              1300    NA                         NA
              1400    NA                         NA
              1500    NA                         NA
              1600    NA                         NA
              1700    NA                         NA

10/20/2004    0700    18.2                       0.00
              0800    19.8                       0.00
              0900    22.4                       0.00
              1000    23.6                       0.00
              1100    25.0                       0.00
              1200    25.5                       0.00
              1300    26.3                       0.00
              1400    26.5                       0.00
              1500    25.8                       0.00
              1600    25.5                       0.00
              1700    23.9                       0.00

APPENDIX C.  SOIL MOISTURE


Date: 18 October 2004
Times: NA, 1300 hours

Probe Location         Layer, in.    AM Reading, %    PM Reading, %

Calibration Area       0 to 6        NA               1.6
                       6 to 12       NA               2.2
                       12 to 24      NA               3.7
                       24 to 36      NA               3.6
                       36 to 48      NA               4.1

Mogul Area             0 to 6        NA               1.6
                       6 to 12       NA               2.1
                       12 to 24      NA               3.4
                       24 to 36      NA               3.9
                       36 to 48      NA               4.0

Desert Extreme Area    0 to 6        NA               1.6
                       6 to 12       NA               2.3
                       12 to 24      NA               3.2
                       24 to 36      NA               3.9
                       36 to 48      NA               4.0

Date: 19 October 2004
Times: 0630 hours, 1300 hours

Probe Location         Layer, in.    AM Reading, %    PM Reading, %

Calibration Area       0 to 6        1.8              1.8
                       6 to 12       2.2              2.2
                       12 to 24      3.7              3.7
                       24 to 36      3.6              3.6
                       36 to 48      4.1              4.1

Mogul Area             0 to 6        1.6              1.6
                       6 to 12       2.0              2.1
                       12 to 24      3.6              3.4
                       24 to 36      3.9              4.0
                       36 to 48      4.0              4.0

Desert Extreme Area    0 to 6        1.7              1.6
                       6 to 12       2.0              1.8
                       12 to 24      3.4              3.2
                       24 to 36      3.9              3.9
                       36 to 48      4.1              4.0

Date: 20 October 2004
Times: 0645 hours, 1230 hours

Probe Location         Layer, in.    AM Reading, %    PM Reading, %

Calibration Area       0 to 6        1.8              1.8
                       6 to 12       2.2              2.2
                       12 to 24      3.7              3.7
                       24 to 36      3.6              3.6
                       36 to 48      4.1              4.1

Mogul Area             0 to 6        1.6              1.6
                       6 to 12       2.0              2.0
                       12 to 24      3.4              3.4
                       24 to 36      3.9              3.9
                       36 to 48      4.0              4.0

Desert Extreme Area    0 to 6        1.7              1.6
                       6 to 12       2.0              1.8
                       12 to 24      3.4              3.2
                       24 to 36      3.9              3.9

APPENDIX D.  DAILY ACTIVITY LOGS

Note: Activities pertinent to this specific demonstration are indicated in highlighted text.

APPENDIX F.  ABBREVIATIONS


AEC      =  U.S. Army Environmental Center
APG      =  Aberdeen Proving Ground
ATC      =  U.S. Army Aberdeen Test Center
CEHNC    =  Corps of Engineers - Huntsville Center
COTS     =  commercial off-the-shelf
CRADA    =  Cooperative Research and Development Agreement
EM       =  electromagnetic
ERDC     =  U.S. Army Corps of Engineers Engineering Research and Development Center
ESTCP    =  Environmental Security Technology Certification Program
EQT      =  Army Environmental Quality Technology Program
GPS      =  Global Positioning System
JPG      =  Jefferson Proving Ground
PDOP     =  Position Dilution of Precision
POC      =  point of contact
QA       =  quality assurance
QC       =  quality control
ROC      =  receiver-operating characteristic
RTK      =  real time kinematic
RTS      =  Robotic Total Station
SERDP    =  Strategic Environmental Research and Development Program
STOLS    =  Surface Towed Ordnance Location System
UTM      =  universal transverse mercator
UXO      =  unexploded ordnance
YPG      =  U.S. Army Yuma Proving Ground

APPENDIX G.  DISTRIBUTION LIST

DTC Project No. 8-CO-160-UXO-021

                                                      No. of
Addressee                                             Copies

Commander
U.S. Army Environmental Center
ATTN: SFIM-AEC-ATT (Mr. George Robitaille)               2
Aberdeen Proving Ground, MD 21010-5401

GEO-CENTERS, Inc.
ATTN: (Mr. Rob Spiegel)                                  1
7 Wells Avenue
Newton, MA 02459

SERDP/ESTCP
ATTN: (Ms. Anne Andrews)                                 1
901 North Stuart Street, Suite 303
Arlington, VA 22203

Commander
U.S. Army Aberdeen Test Center
ATTN: CSTE-DTC-SL-E (Mr. Larry Overbay)                  1
      (Library)                                          1
      CSTE-DTC-AT-CS-R                                   1
Aberdeen Proving Ground, MD 21005-5059

Defense Technical Information Center                     2
8725 John J. Kingman Road, STE 0944
Fort Belvoir, VA 22060-6218

Secondary distribution is controlled by Commander, U.S. Army Environmental Center,
ATTN: SFIM-AEC-ATT.